Integration Analysis of Diverse Genomic Data Using Multi-clustering Results
Abstract
In modern data mining applications, clustering algorithms are among the most important approaches because they group the elements of a dataset according to their similarities and do not require any class label information. In recent years, various methods for ensemble selection and for combining clustering results have been designed to optimize clustering results. Moreover, given the complexity of data objects, analyzing multiple data sources together is much more powerful than evaluating each source separately. Therefore, a new paradigm is required that combines the genome-wide experimental results of multi-source datasets. However, multi-source data analysis is more difficult than single-source data analysis. In this paper, we propose a new clustering ensemble approach for multi-source bio-data on complex objects. In addition, we present encouraging clustering results on a real bio-dataset examined using our proposed method.
Similar resources
Using Clustering and Factor Analysis in Cross Section Analysis Based on Economic-Environment Factors
Homogeneity of groups is important in studies that use cross-sectional and multi-level data. Most studies in economics, especially panel data analyses, need some kind of homogeneity to ensure the validity of their results. This paper presents the methods known as clustering and homogenization of groups in cross-section studies based on enviro-economic components. For this, a sample of 92 countries which...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis, and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm
Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...
HLA and HIV Infection Progression: Application of the Minimum Description Length Principle to Statistical Genetics
Chair: V. Maojo. HLA and HIV Infection Progression: Application of the Minimum Description Length Principle to Statistical Genetics. Peter T. Hraber, Bette T. Korber, Steven Wolinsky, Henry A. Erlich, Elizabeth A. Trachtenberg, and Thomas B. Kepler. Visualization of functional aspects of microRNA regulatory networks using the Gene Ontology. Alkiviadis Symeonidis, Ioannis G. Tollis, and Martin Reczk...
Locating Ornamental and Medicinal Saffron Cultivation Based on AHP Analysis in GIS Environment in Ardabil Province
Ornamental and medicinal saffron, which has high economic value, accounts for a large share of Iran's non-oil exports. As a result, identifying suitable areas for saffron cultivation in Ardabil province, located in the northwestern part of Iran, is necessary for effective planning to expand the cultivation and production of this product. Given the effect of various environmental factors on growth and y...